A regular expression is a pattern that describes a set of
strings. Regular expressions are constructed analogously to
arithmetic expressions, by using various operators to
combine smaller expressions.
Grep understands two different versions of regular
expression syntax: "basic" and "extended." In
GNU grep, there is no difference in available functionality
using either syntax. In other implementations, basic
regular expressions are less powerful. The following
description applies to extended regular expressions;
differences for basic regular expressions are summarized
afterwards.
The fundamental building blocks are the regular expressions
that match a single character. Most characters, including
all letters and digits, are regular expressions that match
themselves. Any metacharacter with special meaning may be
quoted by preceding it with a backslash.
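As an illustration of quoting, the sketch below compares an
unescaped and an escaped metacharacter. It uses the period, which
matches any single character when it is not quoted. The sample
lines and the variable names are made up for this example.

```shell
# Hypothetical sample input: only the first line contains a literal "a.c".
sample=$(printf 'a.c\nabc\naxc\n')

# Unquoted, the period is a metacharacter and matches any character,
# so all three lines are selected.
unescaped=$(printf '%s\n' "$sample" | grep -E 'a.c')

# Preceded by a backslash, the period matches only itself,
# so only the line "a.c" is selected.
escaped=$(printf '%s\n' "$sample" | grep -E 'a\.c')

printf 'unescaped:\n%s\nescaped:\n%s\n' "$unescaped" "$escaped"
```

The patterns are single-quoted so that the shell passes the
backslash through to grep unchanged.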
A list of characters enclosed by [ and ] matches any single
character in that list; if the first character of the list
is the caret ^ then it matches any character not in the
list. For example, the regular expression [0123456789]
matches any single digit. A range of characters may be
specified by giving the first and last characters, separated
by a hyphen. Finally, certain named classes of characters
are predefined. Their names are self-explanatory, and they
are [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:],
[:lower:], [:print:], [:punct:], [:space:], [:upper:], and
[:xdigit:]. For example, [[:alnum:]] means [0-9A-Za-z],
except the latter form depends upon the POSIX locale and the
ASCII character encoding, whereas the former is independent
of locale and character set. (Note that the brackets in
these class names are part of the symbolic names, and must
be included in addition to the brackets delimiting the
bracket list.) Most metacharacters lose their special
meaning inside lists. To include a literal ] place it first
in the list. Similarly, to include a literal ^ place it
anywhere but first. Finally, to include a literal - place
it last.
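The bracket list rules above can be tried out with a short
sketch like the following. Every sample line is a single
character, so a line is selected exactly when that character is
matched by the list. The sample data and variable names are
assumptions made for this example.

```shell
# Hypothetical sample input: one character per line.
sample=$(printf 'a\n7\n-\n]\n^\nZ\n')

# A range given by its first and last characters: selects 7.
digit_lines=$(printf '%s\n' "$sample" | grep -E '[0-9]')

# A named class; the inner brackets belong to the class name: selects a and Z.
alpha_lines=$(printf '%s\n' "$sample" | grep -E '[[:alpha:]]')

# A leading caret negates the list: selects -, ] and ^.
negated_lines=$(printf '%s\n' "$sample" | grep -E '[^[:alnum:]]')

# Literal ] placed first, literal ^ placed not first, literal - placed last:
# selects -, ] and ^.
literal_lines=$(printf '%s\n' "$sample" | grep -E '[]^-]')

printf '%s\n' "$digit_lines" "$alpha_lines" "$negated_lines" "$literal_lines"
```

Matching lines are printed in input order, so the last two
commands both print -, ] and ^ on separate lines.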
The period . matches any single character. The symbol \w is
a synonym for [[:alnum:]] and \W is a synonym for
[^[:alnum:]].
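A minimal sketch of the period, \w and \W, assuming a grep that
supports the \w and \W symbols (GNU grep does). The sample lines
and variable names are made up for this example.

```shell
# Hypothetical sample input.
sample=$(printf 'cat\ncot\nct\nc-t\n')

# The period matches any single character: selects cat, cot and c-t.
# The line "ct" has nothing between c and t, so it is not selected.
any_char=$(printf '%s\n' "$sample" | grep -E 'c.t')

# \w matches an alphanumeric character: selects cat and cot.
word_char=$(printf '%s\n' "$sample" | grep -E 'c\wt')

# \W matches a character that is not alphanumeric: selects c-t.
non_word=$(printf '%s\n' "$sample" | grep -E 'c\Wt')

printf '%s\n' "$any_char" "$word_char" "$non_word"
```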
The caret ^ and the dollar sign $ are metacharacters that
respectively match the empty string at the beginning and end
of a line. The symbols \< and \> respectively match the
empty string at the beginning and end of a word. The symbol
\b matches the empty string at the edge of a word, and \B
matches the empty string provided it's not at the edge of a
word.
GREP_OPTIONS is '--binary-files=without-match
--directories=skip', grep behaves as if the two options
--binary-files=without-match and --directories=skip had
been specified before any explicit options. Option
specifications are separated by whitespace. A
backslash escapes the next character, so it can be used